# Text Visual Question Answering
Git Base Textvqa
MIT
A visual question answering model based on microsoft/git-base-textvqa and fine-tuned on the TextVQA dataset. It is suited to answering questions about images that contain text.
Large Language Model
Transformers, Other

Hellraiser24
19
0
Git Large Textvqa
MIT
GIT is a vision-language model built on a Transformer decoder that is conditioned on both CLIP image tokens and text tokens. This checkpoint is fine-tuned for TextVQA tasks.
Image-to-Text
Transformers, Supports Multiple Languages

microsoft
62
4
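
For readers who want to try one of the GIT TextVQA checkpoints listed above, the following is a minimal sketch of how such a model might be queried with the Hugging Face Transformers library. It rests on several assumptions not stated in the listing: the checkpoint id `microsoft/git-large-textvqa` is inferred from the card's author and model name, the helper names `prepend_cls` and `answer_question` are hypothetical, and `torch`, `transformers`, and `Pillow` are assumed to be installed. The question is tokenized without special tokens and prefixed with the tokenizer's CLS token before generation, which is the pattern commonly shown for GIT question answering.

```python
def prepend_cls(question_ids, cls_token_id):
    """Prefix the question token ids with the CLS token id (hypothetical helper)."""
    return [cls_token_id] + list(question_ids)


def answer_question(image_path, question, checkpoint="microsoft/git-large-textvqa"):
    """Ask a question about an image; the checkpoint id is an assumption."""
    # Heavy dependencies are imported lazily so prepend_cls works without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Image branch: turn the picture into pixel values for the image encoder.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Text branch: tokenize the question and add the CLS prefix manually.
    question_ids = processor(text=question, add_special_tokens=False).input_ids
    cls_id = processor.tokenizer.cls_token_id
    input_ids = torch.tensor(prepend_cls(question_ids, cls_id)).unsqueeze(0)

    # The decoder continues the question tokens with the answer tokens.
    with torch.no_grad():
        generated = model.generate(
            pixel_values=pixel_values, input_ids=input_ids, max_length=50
        )
    # The decoded text contains the question followed by the generated answer.
    return processor.batch_decode(generated, skip_special_tokens=True)[0]
```

A call such as `answer_question("bus.jpg", "what does the sign say?")` (with a hypothetical local image file) would download the checkpoint on first use and return the question text followed by the model's answer.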
Featured Recommended AI Models